(57) Abstract

A solid support based hybridization assay is provided which allows for the systematic and reproducible analysis of repeat and
tandem repeat oligonucleotide sequences of DNA and RNA by hybridization to a reverse dot blot array comprising strings of such repeats
complementary to those found in particular nucleic acid targets (e.g., analyte PCR product). An addressable library (i.e., an indexed set)
of complementary repeats is synthesized on a suitable support. Preferably, the support comprises a low fluorescent background support,
thereby facilitating the use of non-radioisotopic modes of detection (such as fluorescence or chemiluminescence); particularly suitable in
this regard is an aminated polypropylene support or similar material. Preferred arrays permit screening of DNA and RNA samples for
complete sets of particular types of nucleotide repeat sequences (e.g., all nucleotide doublet or triplet repeats).
OLIGONUCLEOTIDE REPEAT ARRAYS

Background of the Invention 

The present invention relates generally to the
fields of biochemistry and medicine. In particular, the
invention is directed to materials and methods useful in
the diagnosis of genetic mutations of clinical relevance.

Short tandem repeats (STR) have been identified in
a number of genes. It has been proposed that particular
unstable triplet repeat oligonucleotides are correlated
with a number of genetic diseases in humans, including
Kennedy's disease [La Spada, A. et al., Nature 352, 77-79
(1991)], fragile-X syndrome [Verkerk, A.J.M.H. et al.,
Cell 65, 905-914 (1991)], myotonic dystrophy [Fu, Y.H. et
al., Science 255, 1256-1258 (1992)], Huntington disease
[The Huntington's Disease Collaborative Research Group,
Cell 72, 971-983 (1993)] and spinocerebellar ataxia type
1 [Orr, H.T. et al., Nature Genet. 4, 221-226 (1993)].
Similarly, doublet repeats have also been reported to be
associated with particular disease states; for example,
correlations have been proposed with cystic fibrosis
[Chu, C.-S. et al., Nature Genetics 3, 151-156 (1993)]
and colorectal cancer [Thibodeau, S.N. et al., Science
260, 816-819 (1993)]. Higher-order repeats, such as
tetramers [see, e.g., Gen, M.W. et al., Genomics 17,
770-772 (1993)], have also been identified.

One gene which has been the subject of intense scrutiny
is the Huntington's disease gene. The trinucleotide
hybridization approach was recently utilized to map out
tandem repeats across a section of the gene. In this
section, 51 triplet repeats spanning a 1.86 Mbp DNA
segment were identified by Southern transfer of
restriction enzyme digests of a specific cosmid and
probing with 32P-labelled oligonucleotide probes
[Hummerich, et al., "Distribution of trinucleotide repeat
sequences across a 2 Mbp region containing the
Huntington's disease gene," Human Molecular Genetics 3,
73 (1994)].

DNA polymorphisms which arise from allelic
differences in the number of repeats have been identified
by such terminology as short tandem repeats (STR),
variable number of tandem repeats (VNTR), minisatellites
(tandem repeats of a short sequence, originally defined
as 9-60 bp) and microsatellites (originally defined as
1-5 bp) [McBride, L.J. & O'Neill, M.D., American
Laboratory, pp. 52-54 (November 1991)]; minisatellites
and microsatellites would be considered subclasses of the
VNTR. It is estimated that there are up to 500,000
microsatellite repeats distributed throughout the human
genome, at an average spacing of 7000 bp. Therefore, it
is apparent that most genes will contain VNTR regions and
that these regions can be used as genetic markers. For
example, VNTRs are currently being used as markers in
studies concerned with the inheritance of certain
mutations leading to various forms of cancer. Recently,
it has been discovered that certain triplet repeat
expansions are associated with a predisposition towards
certain diseases; a large expansion is typically
associated with the onset of the disease. For example,
the (CGG) triplet repeat region associated with Fragile
X occurs at a frequency of 10-50 repeat units in the
normal population, while in those afflicted with the
disease the expansion is between 200 and 2000 repeats.

As it becomes possible to determine whether a
particular genotype comprises an unstable repeat and/or
is associated with a particular disease state, there is
a considerable incentive to develop useful methods to
characterize STRs. The heretofore available methods for
initial scanning for STRs have generally required
time-consuming sequential oligonucleotide hybridizations to
filter-bound target DNAs to identify specific STRs [see,
e.g., Litt, M. and Luty, J.A., Am. J. Hum. Genet. 44,
397-401 (1989); Weber, J.L. and May, P.E., Am. J. Hum.
Genet. 44, 388-396 (1989); Fu et al., supra]. In
particular, the analysis of oligonucleotide repeats is
typically carried out at the present time by Southern
blotting of restriction fragments followed by
hybridization analysis using a specified repetitive
sequence probe. Alternatively, it is possible to probe
dot blots of the target DNA [Iizuka, et al., GATA 10:2-5
(1993)].

Both of these heretofore-known techniques are
time-consuming and tedious for large sample populations.
Moreover, multiple probings may be required to identify
which repeat might be present. Further, it is often
difficult to reproducibly spot or transfer equivalent
amounts of DNA to these supports; thus, conventional dot
blots and transfers show variation in signal intensity
from batch to batch. In addition, any regions of DNA
that might become cross-linked to the support (e.g.,
through UV light) would be inaccessible to probes.

It would be highly useful for clinical investigators
to be able to screen large sample populations of patients'
DNAs in an effective manner. As additional STRs are
identified and associated with particular conditions, the
need for simple and effective screening methods becomes
greater.

PCT published application No. WO 89/10977 describes
methods and apparatus for analyzing polynucleotide
sequences in which an array of the whole or a chosen part
of a complete set of oligonucleotides is bound to a
solid support. The different oligonucleotides occupy
separate cells of the array and are capable of taking
part in hybridization reactions. For studying
differences between polynucleotide sequences, the array
may comprise the whole or a chosen part of a complete set
of oligonucleotides comprising the polynucleotide
sequences. While it is suggested that a small array may
be useful for many applications, such as the analysis of
a gene for mutations, there is no teaching or suggestion
of a specific array or method for using same which would
permit the rapid and accurate screening of a wide range
of biological materials for tandem repeats. Moreover,
the arrays described in WO 89/10977 are designed
specifically for use in sequencing by hybridization; the
presence of long tandem nucleotide repeats can present a
significant problem in attempts to sequence a sample
using the methods described in WO 89/10977.

It is an object of the present invention to provide
methods and apparatus for rapid and accurate
identification of nucleotide tandem repeats in DNA and
RNA sequences from a wide variety of sources.

Summary of the Invention 

In accordance with the present invention, a solid
support based hybridization assay is provided which
allows for the systematic and reproducible analysis of
repeat and tandem repeat oligonucleotide sequences of DNA
and RNA by hybridization to a reverse dot blot array
comprising strings of such repeats complementary to those
found in particular nucleic acid targets (e.g., analyte
PCR product). An addressable library (i.e., an indexed
set) of complementary repeats is synthesized on a
suitable support. Preferably, the support comprises a
low fluorescent background support, thereby facilitating
the use of non-radioisotopic modes of detection (such as
fluorescence or chemiluminescence); particularly suitable
in this regard is an aminated polypropylene support or
similar material. Pursuant to a preferred embodiment of
the invention, arrays are provided which permit screening
of DNA and RNA samples for complete sets of particular
types of nucleotide repeat sequences (e.g., all
nucleotide doublet or triplet repeats).
Brief Description of the Drawings

The invention may be better understood with
reference to the accompanying drawings, in which:

Fig. 1 illustrates a pattern for synthesis of
leader sequences and tandem repeat patterns;

Figs. 2(A) - 2(C) illustrate hybridization of
arrayed dinucleotide and trinucleotide
oligonucleotide repeats using 32P-labelled DNA of
cosmid 22-3 as a probe, in which Fig. 2(A)
represents hybridization stringency at 40°C, Fig.
2(B), hybridization stringency at 50°C, and Fig.
2(C), hybridization stringency at 60°C; and

Fig. 3 illustrates the type and position of
STRs indicated by the STR-Strips in 34,977 bp of
cosmid 22.3.



Detailed Description of the Invention 

In accordance with the present invention, defined
repeat and tandem repeat arrays for use in screening
nucleic acid targets for the presence of genetic markers
generally known as variable number [of] tandem repeats
(VNTRs) are synthesized on a suitable support. After
hybridization of the target materials with the array, the
identity of any tandem repeat sequence(s) in the target
materials may be readily ascertained by observing the
location(s) at which binding occurs. Pursuant to the
present invention, probes are reproducibly synthesized on
the surface, freely accessible to target DNA. Moreover,
all hybridizations can be rapidly identified under a
limited number of stringency conditions.

The arrays of the present invention comprise a
predetermined set of oligonucleotides attached to the
surface of the solid support. One particularly useful
class of tandem repeats for arrays in accordance with the
present invention comprises the complete class of 60
tandem triplet repeats (i.e., all 64 possible triplet
combinations minus the four homopolymer combinations).
Another useful class of tandem repeats is the complete
class of 6 tandem doublet repeats (i.e., the 10 possible
doublet combinations of the nucleic acids A, C, G and T
minus the four homopolymer combinations). Of course,
those skilled in the art would readily appreciate that a
wide range of different combinations of repeat elements
could also be employed in accordance with the present
invention. For example, repeats of a higher order (i.e.,
repeats of four or more nucleotides) may be useful in
some instances. In addition, particular subclasses of
any complete class of all possible tandem repeats of a
given size may be suitable for carrying out particular
types of screenings. For purposes of the present
invention, all predetermined sets of tandem repeats are
contemplated as within the scope of the present
invention.
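
The two class sizes quoted above (60 triplet repeats and 6 doublet repeats) can be checked by direct enumeration. The following is a minimal sketch in Python; the function names are illustrative only and do not appear in the specification. It assumes, as the text states, that triplets are counted as ordered combinations and doublets as unordered combinations of A, C, G and T.

```python
from itertools import combinations_with_replacement, product

BASES = "ACGT"

def triplet_repeat_units():
    """All 64 ordered triplets minus the four homopolymers (AAA, CCC, GGG, TTT)."""
    return ["".join(t) for t in product(BASES, repeat=3) if len(set(t)) > 1]

def doublet_repeat_units():
    """The 10 unordered doublet combinations minus the four homopolymers."""
    return ["".join(d) for d in combinations_with_replacement(BASES, 2)
            if d[0] != d[1]]

if __name__ == "__main__":
    print(len(triplet_repeat_units()))   # 60
    print(doublet_repeat_units())        # ['AC', 'AG', 'AT', 'CG', 'CT', 'GT']
```

Counting doublets as unordered pairs reflects the fact that, in a tandem repeat, (AC)n and (CA)n describe the same repeat read in a shifted frame.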

The sequences forming the array may be directly
linked to the support. In other embodiments of the
arrays of the present invention, the repeat units may be
attached to the support by non-repetitive sequences of
oligonucleotides or other molecules serving as spacers or
linkers to the solid support. In preferred examples of
this embodiment, specific leader sequences are encoded on
either side of the tandem repeat region in an array
format. Depending upon the relative position of the
leader sequence, a PCR or sequencing primer may be
designed. Such primers may then be used to aid in the
characterization of the length of the tandem repeat
and/or the specific flanking sequences, respectively. In
general, a triplet tandem repeat sequence of sufficient
length effectively defines two additional tandem repeat
sequences; for example, a 21mer complementary to (ACG)n
also hybridizes to (GAC)n and (CGA)n. By systematically
including a degenerate set of leader sequences while
reducing the size of the tandem repeat region,
hybridization stringency is increased to allow for
identification of the combination of leader plus the
tandem repeat; in the example, selectivity of CCC ACG ACG
ACG [SEQ ID NO:1] would be observed over, e.g., CCC GAC
GAC GAC [SEQ ID NO:2]. Fig. 1 illustrates a set of
instructions for synthesis of suitable leader sequences
for triplet tandem repeats. Such leader arrays are
particularly advantageous for the purpose of identifying
leader sequences for use as PCR primers to tandem repeat
regions.
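
The frame-shift relationship described above, in which a long (ACG)n repeat also contains (CGA)n and (GAC)n, can be illustrated at the level of plain strings. This is a hedged sketch only: it compares sequences on a single strand and ignores complementarity, strand orientation and hybridization thermodynamics, and the helper names are hypothetical.

```python
def rotations(unit):
    """Return every cyclic rotation (reading frame) of a repeat unit."""
    return [unit[i:] + unit[:i] for i in range(len(unit))]

def tandem(unit, n):
    """Build the tandem repeat (unit)n as a string."""
    return unit * n

if __name__ == "__main__":
    long_repeat = tandem("ACG", 7)                     # a 21mer
    for frame in rotations("ACG"):                     # ACG, CGA, GAC
        # A slightly shorter repeat in any frame occurs within the 21mer.
        print(frame, tandem(frame, 6) in long_repeat)  # True for all three

    # Adding a leader and shortening the repeat removes the ambiguity:
    with_leader_1 = "CCC" + tandem("ACG", 3)           # cf. SEQ ID NO:1
    with_leader_2 = "CCC" + tandem("GAC", 3)           # cf. SEQ ID NO:2
    print(with_leader_1 != with_leader_2)              # True
```

The sketch shows why a repeat-only probe cannot distinguish the three frames, while a leader-plus-repeat sequence fixes the frame.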

The method of the present invention is generally
applicable to a wide range of tandem repeat patterns,
including higher order tandem repeats. As by definition
a tandem repeat consists of at least 2 units of a given
oligomer (for example, a dimer or 2mer), then a (2mer)n
wherein n = 2, 3, 4, ... would represent a dinucleotide
repeat forming a 4mer, 6mer, 8mer, etc. (e.g., ACAC,
ACACAC, ACACACAC, etc.). Similarly, a triplet repeat
would be defined as a (3mer)n and a tetramer repeat as
(4mer)n, wherein n represents the number of repeats
present. Contemplated as within the scope of the present
invention are all tandem repeats of the general formula
(Nmer)n wherein N is an integer greater than 1
representing the number of nucleotides in the repeat
pattern and n is an integer representing the number of
times the pattern is repeated; in general, the product of
N and n is in the range of 4 to about 100, and preferably
6 to about 60.
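
The general formula (Nmer)n and its length ranges can be expressed as a short sketch. The function names and the treatment of "about 100" and "about 60" as hard limits are assumptions made for illustration; the specification itself gives only approximate upper bounds.

```python
def tandem_repeat(unit, n):
    """Build (Nmer)n, where N = len(unit) > 1 and n >= 2 repeats."""
    if len(unit) < 2:
        raise ValueError("repeat unit must contain at least 2 nucleotides")
    if n < 2:
        raise ValueError("a tandem repeat needs at least 2 units")
    return unit * n

def in_general_range(unit, n, low=4, high=100):
    """True if N x n falls in the general range (4 to about 100)."""
    return low <= len(unit) * n <= high

def in_preferred_range(unit, n):
    """True if N x n falls in the preferred range (6 to about 60)."""
    return in_general_range(unit, n, low=6, high=60)

if __name__ == "__main__":
    for n in (2, 3, 4):
        print(tandem_repeat("AC", n))      # ACAC, ACACAC, ACACACAC
    print(in_general_range("AC", 2))       # True  (4mer)
    print(in_preferred_range("AC", 2))     # False (below 6)
    print(in_preferred_range("ACG", 7))    # True  (21mer)
```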

Higher order tandem repeat combinations representing
combinations of two or more individual tandem repeats are
also contemplated as within the scope of the present
invention; for example, such higher order tandem repeat
combinations may include two dimer patterns, a dimer and
a triplet, two triplets, etc. In general terms, such
repeat combinations may be described as

(Nmer)n(Mmer)m

in which N and M are independently selected integers
greater than 1 and represent the number of nucleotides in
the respective repeat pattern and n and m are
independently selected integers and represent the number
of times the respective pattern is repeated. In the case
of three tandem patterns, the structure may be
represented as

(Nmer)n(Mmer)m(Pmer)p

in which P is defined in the same manner as N and M, and
p in the same manner as n and m. Moreover, these higher
order tandem repeat combinations may also be found in a
repeat pattern; such a complex higher order tandem repeat
combination may be described as

[(Nmer)n(Mmer)m]x

or

[(Nmer)n(Mmer)m(Pmer)p]x,

in which N, M, P, n, m and p are as previously defined
and x is an integer which represents the number of times
the [(Nmer)n(Mmer)m] or [(Nmer)n(Mmer)m(Pmer)p] pattern is
repeated. Contemplated as within the scope of the
present invention are those combinations wherein x(Nn +
Mm) or x(Nn + Mm + Pp) is in the range of 4 to about 100,
preferably in the range of 6 to about 60. In the
simplest case comprising two repeat patterns and wherein
x is 1, N and M are both 2 and n and m are both 1; 1 x 2
+ 1 x 2 = 4. Table 1 illustrates the construction of
higher order repeats in which n and m are both 2.

TABLE 1

Tandem
Repeat     (2mer)n  (3mer)n  (4mer)n  (5mer)n  (6mer)n  (7mer)n

(2mer)m     8mer    10mer    12mer    14mer    16mer    18mer
(3mer)m    10mer    12mer    14mer    16mer    18mer    20mer
(4mer)m    12mer    14mer    16mer    18mer    20mer    22mer
(5mer)m    14mer    16mer    18mer    20mer    22mer    24mer
(6mer)m    16mer    18mer    20mer    22mer    24mer    26mer
(7mer)m    18mer    20mer    22mer    24mer    26mer    28mer


Thus, for example, the 8mer comprising a (2mer)n = (AC)2
and a (2mer)m = (AT)2 would have the formula ACACATAT;
similarly, the 12mer comprising a (3mer)n = (ACG)2 and a
(3mer)m = (ATC)2 would have the formula ACGACGATCATC [SEQ
ID NO:3]. A complex higher order repeat [(AC)n(AT)m]x in
which n and m are 1 and x is 2 would have the formula
ACATACAT; where x is 3, the formula would be ACATACATACAT
[SEQ ID NO:4]. Those skilled in the art would readily
appreciate the variety of simple and higher-order tandem
repeat patterns falling within the scope of the present
invention.
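
The worked examples above follow mechanically from the bracket notation. A minimal sketch, assuming a hypothetical helper `higher_order` that takes a list of (unit, count) pairs and an outer repeat count x:

```python
def higher_order(blocks, x=1):
    """Build [(Nmer)n(Mmer)m...]x from a list of (unit, count) pairs."""
    inner = "".join(unit * count for unit, count in blocks)
    return inner * x

def total_length(blocks, x=1):
    """Length x(Nn + Mm + ...) of the resulting oligonucleotide."""
    return x * sum(len(unit) * count for unit, count in blocks)

if __name__ == "__main__":
    print(higher_order([("AC", 2), ("AT", 2)]))        # ACACATAT
    print(higher_order([("ACG", 2), ("ATC", 2)]))      # ACGACGATCATC
    print(higher_order([("AC", 1), ("AT", 1)], x=2))   # ACATACAT
    print(higher_order([("AC", 1), ("AT", 1)], x=3))   # ACATACATACAT
    # With n = m = 2 the length is 2N + 2M, e.g. a 2mer with a 7mer gives 18.
    print(total_length([("AC", 2), ("ACGTACG", 2)]))   # 18
```

The last line reproduces one entry of the length table for n = m = 2; the 7mer unit shown is an arbitrary placeholder chosen only for its length.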

The methods and apparatus in accordance with the
present invention take advantage of the fact that under
appropriate conditions oligonucleotides form base paired
duplexes with oligonucleotides which have a complementary
base sequence. The stability of the duplex is dependent
on a number of factors, including the length of the
oligonucleotides, the base composition, and the
composition of the solution in which hybridization is
effected. The effects of base composition on duplex
stability may be reduced by carrying out the
hybridization in particular solutions, for example in the
presence of high concentrations of tertiary or quaternary
amines.

The thermal stability of the duplex is also
dependent on the degree of sequence similarity between
the sequences. By carrying out the hybridization at
temperatures close to the anticipated Tm's of the type of
duplexes expected to be formed between the target
material(s) and the oligonucleotides bound to the array,
the rate of formation of mismatched duplexes may be
substantially reduced.
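
The dependence of duplex stability on length and base composition can be illustrated with the Wallace rule, a widely used rule of thumb for short oligonucleotides in roughly 1 M salt: Tm is approximately 2°C per A or T plus 4°C per G or C. This rule is not part of the specification and is shown here only as an assumed, rough approximation; it becomes unreliable for longer sequences, and actual hybridization temperatures are to be determined empirically as described in the text.

```python
def wallace_tm(seq):
    """Rough Tm estimate (deg C) by the Wallace rule: 2*(A+T) + 4*(G+C).

    An approximation for short oligonucleotides only; not the method
    of the specification.
    """
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc

if __name__ == "__main__":
    print(wallace_tm("AT" * 10))   # 40: A+T-rich 20mer
    print(wallace_tm("GC" * 10))   # 80: G+C-rich 20mer
    print(wallace_tm("ACG" * 7))   # 70: mixed 21mer
```

The spread between the (AT)10 and (GC)10 estimates shows why a single hybridization temperature cannot suit every cell of a repeat array unless composition effects are compensated.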

The number of repeats in the tandem sequences
attached to the array may vary over a broad range from
the minimum of two necessary to constitute a repeat to a
maximum on the order of about 50. Of course, the optimum
range for the number of tandem repeats in any given
instance is dependent upon a number of factors, including
in particular the composition and the length of the
repeat. In general, the Tm of the complex formed between
a given sequence in the target material and the
complementary sequence in the array increases as the
length of the sequences increases; however, only minor
increases in Tm are observed once the sequences have
reached a length of about 50-60 bases. The sequences on
the array contain at least four bases (the minimum for a
repeat of a doublet pattern). It is generally preferred
that the sequences on the array comprise at least about
6 bases, more preferably at least about 10 bases, and
most preferably on the order of about 15 to about 60
bases. As previously noted, there is little practical
advantage in using sequences substantially longer than
about 60 bases; nonetheless, extended sequences of up to
about 100 bases in length (corresponding to, e.g., 50
repeats of a doublet sequence) and longer are also
contemplated as within the scope of the present
invention.

In addition, in accordance with preferred
embodiments of the invention the length of each sequence
employed in the array may be selected so as to optimize
binding of target materials to the array. For any given
tandem repeat sequence, an optimum length for use with
any particular target material under specified screening
conditions may be determined empirically. Thus, the
length for each individual element of the set of tandem
repeats comprising the array may be optimized for the
screening of particular target materials under specific
conditions (e.g., at a given temperature).

A wide variety of array formats may be employed in
accordance with the present invention. One particularly
useful format is a linear array of oligonucleotide bands,
generally referred to in the art as a dipstick. Another
suitable format comprises a two-dimensional pattern of
discrete cells (e.g., 4096 squares in a 64 by 64 array).
Of course, as would be readily appreciated by those
skilled in the art, other array formats (e.g., circular)
would be equally suitable for use in accordance with the
present invention. While arrays may be prepared on a
variety of materials including glass plates, it is
presently preferred to use an organic polymer medium. As
used herein, the term "organic polymer" is intended to
mean a support material which is most preferably
chemically inert under conditions appropriate for
biopolymer synthesis and which comprises a backbone
comprising various elemental substituents including, but
not limited to, hydrogen, carbon, oxygen, fluorine,
chlorine, bromine, sulfur and nitrogen. Representative
polymers include, but are not limited to, the following:
polypropylene, polyethylene, polybutylene,
polyisobutylene, polybutadiene, polyisoprene,
polyvinylpyrrolidone, polytetrafluoroethylene,
polyvinylidene difluoride, polyfluoroethylene-propylene,
polyethylene-vinyl alcohol, polymethylpentene,
polychlorotrifluoroethylene, polysulfones, and blends and
copolymers thereof. As used herein, the term "medium" is
intended to mean the physical structural shape of the
polymer. Thus, medium can be generally defined as
polymer films (i.e., polymers having a substantially
non-porous surface); polymer membranes (i.e., polymers having
a porous surface); polymer filaments (e.g., mesh and
fabrics); polymer beads; polymer foams; polymer frits;
and polymer threads. Preferably, the polymer medium is
a thread, membrane or film; most preferably, the polymer
medium is a film. An exemplary organic polymer medium is
a polypropylene sheet having a thickness on the order of
about 1 mil (0.001 inch), although the thickness of the
film is not critical and may be varied over a fairly
broad range. Particularly preferred for preparation of
arrays at this time are biaxially oriented polypropylene
(BOPP) films; in addition to their durability, BOPP films
exhibit a low background fluorescence.

The array formats of the present invention may be
included in a variety of different types of device. As
used herein, the term "device" is intended to mean any
device to which the solid support can be affixed, such as
microtiter plates, test tubes, inorganic sheets,
dipsticks, etc. For example, when the solid support is
a polypropylene thread, one or more polypropylene threads
can be affixed to a plastic dipstick-type device;
polypropylene membranes can be affixed to glass slides.
The particular device is, in and of itself, unimportant.
All that is necessary is that the solid support can be
affixed thereto without affecting the functional behavior
of the solid support or any biopolymer adsorbed thereon,
and that the device is stable to any materials into which
the device is introduced (e.g., clinical samples, etc.).

The arrays of the present invention may be prepared
by a variety of approaches which are known to those
working in the field. Pursuant to one type of approach,
the complete sequences are synthesized separately and
then attached to a solid support. However, it is
presently considered particularly advantageous to
synthesize the sequences directly onto the support to
provide the desired array. Suitable methods for
covalently coupling oligonucleotides to a solid support
and for directly synthesizing the oligonucleotides onto
the support would be readily apparent to those working in
the field; a summary of suitable methods may be found in,
e.g., Matson, R.S. et al., Analytical Biochem. 217,
306-310 (1994), hereby incorporated by reference.
Advantageously, the oligonucleotides are synthesized onto
the support using conventional chemical techniques as
heretofore employed for preparing oligonucleotides on
solid supports comprising, e.g., controlled pore size
glass (CPG), as described for example in PCT applications
WO 85/01051 and WO 89/10977, or polypropylene, as
described in copending U.S. patent application Serial No.
07/091,100, which has been assigned to the assignee of
the present application and is herein incorporated by
reference. Pursuant to one preferred approach, a
polypropylene support (for example, a biaxially-oriented
polypropylene) is first surface aminated by exposure to
an ammonia plasma generated by radiofrequency plasma
discharge. The reaction of a phosphoramidite-activated
nucleotide with the aminated membrane followed by
oxidation with, e.g., iodine provides a base stable
amidate bond to the support.

As described in U.S. patent application Serial No.
08/144,954 filed October 28, 1993, which has been
commonly assigned to the assignee of the present
invention and is incorporated by reference herein, a
suitable array may advantageously be produced using
automated means to synthesize oligonucleotides in the
cells of the array by laying down the precursors for the
four bases in a predetermined pattern. Briefly, a
multiple-channel automated chemical delivery system is
employed to create oligonucleotide probe populations in
parallel rows (corresponding in number to the number of
channels in the delivery system) across the substrate.
Following completion of oligonucleotide synthesis in a
first (1°) direction, the substrate may then be rotated
by 90° to permit synthesis to proceed within a second
(2°) set of rows that are now perpendicular to the first
set. This process creates a multiple-channel array whose
intersection generates a plurality of discrete cells.
Table 2 describes an exemplary vertical array of 64
oligonucleotides consisting of 60 triplet tandem repeat
sequences (21mers) and dinucleotide tandem repeat
sequences (20mers).

For the example of the preferred array comprising
specific leader sequences as described in Fig. 1 (and as
more fully described in the above-noted U.S. patent
application Serial No. 08/144,954), all of the degenerate
triplet repeats (n=64, including homopolymers) are
synthesized in a first direction (1° synthesis) in the 64
channels in Cycles 1-3 as 3' Leader Sequences (LLL). For
example, lane 1 is AAA, lane 2 AAC, lane 5 ACA and lane
64 TTT. Then, the film is rotated 90° to perform
synthesis in the second direction (2° synthesis or
cross-synthesis) in Cycles 4-15. All 64 triplet tandem repeat
sequences of (NNN)4 are then produced. The 2-dimensional
array created thereby is the product of a bidirectional
synthesis and comprises 4096 discrete cells containing
15mer oligonucleotide products (LLL)(NNN)4, in which the
Leader Sequence is placed at the 3'-end of each completed
oligonucleotide. Thus, the following sequences would be
found at the indicated cell positions:

Cell 1,1'    (AAA)(AAA)4 [SEQ ID NO:5]

Cell 1,2'    (AAA)(AAC)4 [SEQ ID NO:6]

Cell 5,1'    (ACA)(AAA)4 [SEQ ID NO:7]

Cell 64,1'   (TTT)(AAA)4 [SEQ ID NO:8]

Cell 64,64'  (TTT)(TTT)4 [SEQ ID NO:9]

This type of array comprising leader sequences (at either
the 5' or 3' end) is particularly preferred in accordance
with the present invention.
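
The cell contents listed above can be reproduced with a short sketch of the bidirectional layout. The lane ordering used here (alphabetical over A, C, G, T) is an assumption; it is consistent with the four lane assignments given in the text (lane 1 AAA, lane 2 AAC, lane 5 ACA, lane 64 TTT) but is not stated explicitly. The function name `cell` is illustrative, and the returned string follows the (LLL)(NNN)4 notation of the text rather than asserting a 5'-to-3' reading direction.

```python
from itertools import product

BASES = "ACGT"
# Assumed lane order: lane 1 = AAA, lane 2 = AAC, ..., lane 64 = TTT.
TRIPLETS = ["".join(t) for t in product(BASES, repeat=3)]

def cell(leader_lane, repeat_lane, copies=4):
    """Return (notation, sequence) for cell (leader_lane, repeat_lane').

    Lanes are 1-based. The sequence string is written leader first,
    mirroring the (LLL)(NNN)4 notation.
    """
    leader = TRIPLETS[leader_lane - 1]
    unit = TRIPLETS[repeat_lane - 1]
    return f"({leader})({unit}){copies}", leader + unit * copies

if __name__ == "__main__":
    for i, j in [(1, 1), (1, 2), (5, 1), (64, 1), (64, 64)]:
        print(f"Cell {i},{j}'", *cell(i, j))
    print(len(TRIPLETS) ** 2)    # 4096 discrete cells
    print(len(cell(5, 1)[1]))    # 15 bases per cell
```

Three leader cycles plus twelve cross-synthesis cycles account for the 15 bases in each cell.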

In order to accommodate a suitably large array, the
pixel size should be as small as possible. Cells having
a width on the order of about 10 µm to about 1 mm would
be particularly suitable. In one preferred embodiment of
the invention, arrays with a pixel width of about 500 µm
are prepared on biaxially-oriented polypropylene.

Pursuant to the present invention, there are also
provided methods for screening DNA and RNA samples
comprising labelling the samples to form labelled
material, applying the labelled material under suitable
hybridization conditions to an array as described herein,
and observing the location of the label on the surface
associated with particular members of the set of
oligonucleotides. Identification of the cell(s) in which
binding occurs permits a rapid and accurate
identification of any nucleotide repeats present in the
sample from which the probes are derived.

In a hybridization reaction in accordance with the
present invention, the array is explored by the labelled
target material in essentially the same manner as a
labelled probe is employed to screen, e.g., a DNA library
containing a gene complementary to the probe. The target
material may suitably comprise labelled sequences
amplified from genomic DNA by the polymerase chain
reaction (PCR), a mRNA population, or a partial or
complete set of oligonucleotides from one or more
chromosomes or an entire genome. To prepare the target
material, the sample may be degraded to form fragments;
where appropriate, the degraded material may then be
sorted (for example, by electrophoresis on a sequencing
gel) to provide a set of oligomers having a specific
length.

The target material is then labelled to facilitate
detection of duplex formation. Suitably, conventional
methods for end-labelling of oligomers are employed.
Both radioactive and fluorescent labelling methods would
be suitable for use in accordance with the present
invention. Commonly-employed techniques routinely permit
the introduction of label into a significant fraction of
the target materials. Using conventional methods for
labelling oligomers with 32P, for example, the radioactive
yield of any individual oligomer even from a total human
genome could be more than 10* dpm/mg of total DNA. For
detection, only a small fraction of the labelled material
would be necessary for hybridization to a pixel of a size
within the preferred range specified herein.
Hybridization conditions for a given combination of array
and target material can routinely be optimized in an
empirical manner to be close to the Tm of the expected
duplexes, thereby maximizing the discriminating power of
the method. Autoradiography (in particular, with 32P) may
cause image degradation which may be a limiting factor
determining resolution; the limit for silver halide films
is about 25 µm. Accordingly, the use of fluorescent
probes (in particular, in conjunction with an array
prepared on a low-fluorescence solid support) is
presently preferred. In view of the low background
fluorescence of the preferred biaxially-oriented
polypropylene substrate, fluorescence-based labelling
techniques may advantageously be employed with arrays on
such a substrate. With either type of labelled target
material, the substantial excess in bound
oligonucleotides of the array makes it possible to
operate at conditions close to equilibrium with most
types of target materials contemplated herein.

As would be readily understood by those skilled in
the art, the chosen conditions of hybridization must be
such as to permit discrimination between exactly matched
and mismatched oligonucleotides. Hybridization
conditions may be initially chosen to correspond to those
known to be suitable in standard procedures for
hybridization to filters and then optimized for use with
the arrays of the present invention; moreover, conditions
suitable for hybridization of one type of target material
would appropriately be adjusted for use with other target
materials for the same array. In particular, it is
appropriate to control temperature closely (preferably,
to better than about ±1°C) to substantially eliminate
formation of duplexes between sequences other than
identical sequences. Particularly when the length of the
oligonucleotide in the target materials is small, it is
necessary to be able to distinguish between slight
differences in the rate and/or extent of hybridization.

A variety of heretofore known hybridization solvents
may suitably be employed, the choice of solvent for
particular hybridizations being dependent on a number of
considerations. For example, G:C base pairs are more
stable than A:T base pairs in 1 M NaCl; thus, the Tm of
double-stranded oligonucleotides with a high G + C
content will be higher than that of corresponding
oligonucleotides with a high A + T content. These
effects are of course particularly pronounced in
sequences comprising tandem nucleotide repeats. In order
to compensate for this discrepancy, a variety of
approaches may be employed. For example, the amount of
oligonucleotide applied to the surface of the support may
be varied in dependence on the nucleotide composition of
the bound oligomer. Further, computer means employed to
analyze data from hybridization experiments may be
programmed to make compensations for variations in
nucleotide compositions. Another expedient (which may be
employed instead of or in addition to those already
mentioned) is the use of a chaotropic hybridization
solvent, such as a ternary or quaternary amine. In this
regard, tetramethylammonium chloride (TMACl) at
concentrations in the range of about 2 M to about 5.5 M
is particularly suitable; at TMACl concentrations around
3.5 to 4 M, the Tm dependence on nucleotide composition
is substantially reduced. In addition, the choice of
hybridization salt has a major effect on overall
hybridization yield; for example, TMACl at concentrations
up to 5 M can increase the overall hybridization yield by
a factor of up to 30 or more (depending to some extent on
the nucleotide composition) compared to 1 M NaCl.
Finally, as previously noted, the length of the
oligonucleotides attached to the array may be varied so
as to optimize hybridization under the particular
conditions employed. Again, it would be a
routine matter for those working in the field to optimize
hybridization conditions for any given combination of
target materials and array.
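
The composition dependence described above can be illustrated with a rough calculation. The sketch below is not part of the disclosed method: it applies the Wallace rule of thumb (2 degrees per A/T base, 4 degrees per G/C base), which is an assumption brought in purely for illustration and is only a coarse guide for short oligonucleotides in high-salt buffers. The function names are hypothetical. It shows how widely the estimated Tm would spread across 21mer triplet repeats if no compensation (such as TMACl) were used.

```python
from itertools import product


def wallace_tm(oligo: str) -> int:
    """Rule-of-thumb Tm in deg C: 2 per A/T base plus 4 per G/C base."""
    at = oligo.count("A") + oligo.count("T")
    gc = oligo.count("G") + oligo.count("C")
    return 2 * at + 4 * gc


def triplet_21mers() -> dict:
    """Map each non-homopolymer triplet unit to its 7-copy 21mer."""
    return {
        "".join(p): "".join(p) * 7
        for p in product("ACGT", repeat=3)
        if len(set(p)) > 1  # skip AAA, CCC, GGG, TTT
    }


if __name__ == "__main__":
    estimates = {u: wallace_tm(o) for u, o in triplet_21mers().items()}
    # List the repeats from lowest to highest estimated Tm.
    for unit in sorted(estimates, key=estimates.get):
        print(f"({unit})7  estimated Tm ~ {estimates[unit]} C")
```

Under this crude estimate an all-A/T repeat and an all-G/C repeat of the same length differ by a factor of two, which is the kind of discrepancy the compensating measures in the text are meant to reduce. The absolute values should not be read as predictions for array-bound oligonucleotides.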

Hybridization is typically carried out with a very
large excess of the bound oligonucleotides over what is
found in the target. In preferred embodiments of the
invention, it is possible in some cases to distinguish
between hybridization involving single and multiple
occurrences of the target sequence, as yield is
proportional to concentration at all stages in the
reaction.

In accordance with another embodiment of the present
invention, an array as described herein may be employed
to selectively isolate and size variable number of tandem
repeats (VNTRs). This is accomplished by preparing a
sample comprising VNTRs in a manner known per se [see,
e.g., McBride & O'Neill, supra], capturing the VNTRs on
the array, selectively dissociating the hybrid and
eluting the VNTR from the support. A selected lane from
the array may be cut out of the support, the ssDNA eluted
therefrom, the number of copies thereof increased by PCR
amplification and size analysis conducted by a
conventional technique (e.g., gel electrophoresis against
DNA size markers). The presence of large molecular
weight strands would indicate an increase in mutational
frequency (i.e., higher orders of tandem repeat regions).

The invention may be better understood with
reference to the accompanying example that is intended
for purposes of illustration only and should not be
construed as in any sense limiting the scope of the
present invention as defined in the claims appended
hereto.

Example

Oligonucleotides were synthesized directly from
monomers onto a 6.6 x 6.6 cm sheet of aminated
polypropylene substrate using standard CED-phosphoramidite
chemistries. A specially designed 64-channel chemical
delivery system (Southern Array Maker™, Beckman
Instruments) as described in co-pending U.S. patent
application Serial No. 08/144,954 was utilized to
prepare the discrete oligonucleotide sequences in
parallel rows across the polypropylene substrate.
Polypropylene film was surface aminated by a
radiofrequency discharge into ammonia gas as described in
Matson et al., supra. The plasma-aminated film was then
placed in the synthesizer. Standard phosphoramidite
chemistry was performed in each of the 64 channels to
create 64 different oligonucleotide sequences on the
film. The substrate was then cut into 0.5 cm widths
perpendicular to the oligonucleotide rows to produce a
panel of 64 tandem repeat sequences. For the present
study 60 trinucleotide (21mers) and 4 dinucleotide
(20mers) tandem repeat sequences were arrayed in the
vertical order shown in Table 2. The arrayed
trinucleotide repeat set represents all triplet frames
except (AAA)7 [SEQ ID NO:10], (CCC)7 [SEQ ID NO:11],
(GGG)7 [SEQ ID NO:12], and (TTT)7 [SEQ ID NO:13] in
3'-->5' direction as well as minus strand orientation.

In order to confirm that all sequences were fully
represented on the panel, a series of complementary probes
were prepared that would verify each sequence by row
position on the strip. As each triplet repeat (n) of a
sufficient length in fact reads the n-1 and n-2 frames as
well, only 20 probe sequences were required to identify
the 60 triplet repeats on the panel. For example, row 2
containing the sequence (AAC)7 [SEQ ID NO:14] also
represents (ACA)6 [SEQ ID NO:15] and (CAA)6 [SEQ ID NO:16]
(sequences found at row 5 and row 17, respectively);
thus, a single tandem repeat probe which has been labeled
will identify the sequences at row positions 2, 5 and 17.
Four additional probe sequences were required to identify
the 4 doublet tandem repeats synthesized on the strip.
It is also possible to combine the sequences of
non-complementary probes (i.e., those that will not
self-hybridize or form hairpin loops) to reduce the total
number of probes necessary to read all row positions.
The following ³²P 5'-end labeled probes were prepared and
used to identify all row positions 1-64 (listed in Table
2) on the panel:

    (TGC)6;                    [SEQ ID NO:17]
    (TGG)4(TG)6(TTA)4;         [SEQ ID NO:18]
    (TGG)4(TGA)4;              [SEQ ID NO:19]
    (TCG)6;                    [SEQ ID NO:20]
    (GGC)5;                    [SEQ ID NO:21]
    (CG)4(GAC)4;               [SEQ ID NO:22]
    (TCA)4(TAA)4(TAC)4;        [SEQ ID NO:23]
    (CCA)4(CAA)4;              [SEQ ID NO:24]
    (TTC)4(TC)6(TCC)4;         [SEQ ID NO:25]
    (GAA)4(GA)6(GGA)4; and     [SEQ ID NO:26]
    (GCC)4(GCA)4               [SEQ ID NO:27].
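
The counting argument in the preceding paragraphs (60 triplet repeats, but only 20 frame classes to probe) can be checked with a short sketch. This is an illustration only and not the procedure used to design the probes; the helper names are hypothetical. It enumerates the non-homopolymer triplets, groups them by cyclic rotation, and confirms that a 7-copy repeat contains the 6-copy repeats of its shifted frames.

```python
from itertools import product


def rotations(unit: str) -> set:
    """All cyclic shifts (reading frames) of a repeat unit."""
    return {unit[i:] + unit[:i] for i in range(len(unit))}


def triplet_units() -> list:
    """The 60 triplets left after excluding AAA, CCC, GGG and TTT."""
    return ["".join(p) for p in product("ACGT", repeat=3) if len(set(p)) > 1]


def frame_classes(units) -> dict:
    """Group units that are reading frames of the same tandem repeat."""
    classes = {}
    for unit in units:
        key = min(rotations(unit))  # alphabetically first frame names the class
        classes.setdefault(key, set()).add(unit)
    return classes


if __name__ == "__main__":
    classes = frame_classes(triplet_units())
    print(len(triplet_units()), "triplet repeats in", len(classes), "frame classes")
    for key in sorted(classes):
        print(key, sorted(classes[key]))
```

The class named `AAC` contains `AAC`, `ACA` and `CAA`, matching the row 2, 5 and 17 example in the text.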

A number of different test DNAs were employed. These
included a 5'-(CAG)7-3' oligonucleotide [SEQ ID NO:28] and
200 bp PCR fragments containing a trinucleotide short
tandem repeat of (CTG)11 [SEQ ID NO:29] [Fu, Y.H., et al.,
Science 255, 1256-1258 (1992)] generated from human
genomic DNA of a wild type individual. An 800 bp PCR
fragment containing a (CAG)10 [SEQ ID NO:30] repeat and a
3.0 kb PCR fragment containing a tandemly combined
(GCA)8+(GCG)4 repeat [SEQ ID NO:31] were generated from
cDNA clones G13 and A12, respectively, recently isolated
in a new cDNA identification system [Lee, G.C., et al.,
Am. J. Hum. Genet. 53 (Suppl.), 1321 (1993)]. DNA
samples of the cosmid MDY2 [Fu, Y.H., et al., Science 255,
1256-1258 (1992)] containing the entire myotonin protein
kinase gene and cosmid 22.3 containing the FMR-1 gene
[Verkerk, A.J.M.H., et al., Cell 65, 905-914 (1991)] have
also been utilized as target materials to evaluate the
test strips.

STRs were identified using the GCG sequence analysis
software package (Genetics Computer Group, Inc. 1991).
11,613 bp of cosmid MDY2 (GenBank accession L00727) and
61,612 bp of a contiguous sequence containing the
complete 34,977 bp sequence of cosmid 22.3 were searched
for STRs.
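
A sequence search of the kind described above can be sketched in a few lines. The example below does not reproduce the GCG package; it is a minimal regular-expression scan written for illustration, and the function name, parameters and default thresholds are assumptions. It reports each run of a tandemly repeated unit with 1-based, inclusive coordinates.

```python
import re


def find_strs(seq: str, unit_len: int = 3, min_repeats: int = 3):
    """Yield (start, end, unit, copies) for tandem repeats in seq.

    Coordinates are 1-based and inclusive. Homopolymer runs are skipped.
    The unit is reported in the frame in which the scan first meets it.
    """
    pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_repeats - 1))
    for match in pattern.finditer(seq.upper()):
        unit = match.group(1)
        if len(set(unit)) == 1:
            continue
        copies = (match.end() - match.start()) // unit_len
        yield match.start() + 1, match.end(), unit, copies


if __name__ == "__main__":
    demo = "AA" + "CTG" * 11 + "AA"  # made-up test sequence
    for hit in find_strs(demo):
        print(hit)
```

Because a tandem repeat can be read in any of its frames, a hit reported as one unit (for example `GCT`) may correspond to the same locus that another tool would call `CTG` or `TGC`.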
TABLE 2

Vertical array of 64 oligonucleotides consisting of 60 triplet tandem repeat sequences (21mers)
and 4 dinucleotide tandem repeat sequences (20mers) on a polypropylene substrate

 #   Oligonucleotide 3'-->5'         Seq ID No.    #   Oligonucleotide 3'-->5'         Seq ID No.

 1   AC AC AC AC AC AC AC AC AC AC   32           33   GAA GAA GAA GAA GAA GAA GAA     62
 2   AAC AAC AAC AAC AAC AAC AAC     14           34   GAC GAC GAC GAC GAC GAC GAC     63
 3   AAG AAG AAG AAG AAG AAG AAG     33           35   GAG GAG GAG GAG GAG GAG GAG     64
 4   AAT AAT AAT AAT AAT AAT AAT     34           36   GAT GAT GAT GAT GAT GAT GAT     65
 5   ACA ACA ACA ACA ACA ACA ACA     35           37   GCA GCA GCA GCA GCA GCA GCA     66
 6   ACC ACC ACC ACC ACC ACC ACC     36           38   GCC GCC GCC GCC GCC GCC GCC     67
 7   ACG ACG ACG ACG ACG ACG ACG     37           39   GCG GCG GCG GCG GCG GCG GCG     68
 8   ACT ACT ACT ACT ACT ACT ACT     38           40   GCT GCT GCT GCT GCT GCT GCT     69
 9   AGA AGA AGA AGA AGA AGA AGA     39           41   GGA GGA GGA GGA GGA GGA GGA     70
10   AGC AGC AGC AGC AGC AGC AGC     40           42   GGC GGC GGC GGC GGC GGC GGC     71
11   AGG AGG AGG AGG AGG AGG AGG     41           43   AG AG AG AG AG AG AG AG AG AG   72
12   AGT AGT AGT AGT AGT AGT AGT     42           44   GGT GGT GGT GGT GGT GGT GGT     73
13   ATA ATA ATA ATA ATA ATA ATA     43           45   GTA GTA GTA GTA GTA GTA GTA     74
14   ATC ATC ATC ATC ATC ATC ATC     44           46   GTC GTC GTC GTC GTC GTC GTC     75
15   ATG ATG ATG ATG ATG ATG ATG     45           47   GTG GTG GTG GTG GTG GTG GTG     76
16   ATT ATT ATT ATT ATT ATT ATT     46           48   GTT GTT GTT GTT GTT GTT GTT     77
17   CAA CAA CAA CAA CAA CAA CAA     47           49   TAA TAA TAA TAA TAA TAA TAA     78
18   CAC CAC CAC CAC CAC CAC CAC     48           50   TAC TAC TAC TAC TAC TAC TAC     79
19   CAG CAG CAG CAG CAG CAG CAG     28           51   TAG TAG TAG TAG TAG TAG TAG     80
20   CAT CAT CAT CAT CAT CAT CAT     49           52   TAT TAT TAT TAT TAT TAT TAT     81
21   CCA CCA CCA CCA CCA CCA CCA     50           53   TCA TCA TCA TCA TCA TCA TCA     82
22   CG CG CG CG CG CG CG CG CG CG   51           54   TCC TCC TCC TCC TCC TCC TCC     83
23   CCG CCG CCG CCG CCG CCG CCG     52           55   TCG TCG TCG TCG TCG TCG TCG     84
24   CCT CCT CCT CCT CCT CCT CCT     53           56   TCT TCT TCT TCT TCT TCT TCT     85
25   CGA CGA CGA CGA CGA CGA CGA     54           57   TGA TGA TGA TGA TGA TGA TGA     86
26   CGC CGC CGC CGC CGC CGC CGC     55           58   TGC TGC TGC TGC TGC TGC TGC     87
27   CGG CGG CGG CGG CGG CGG CGG     56           59   TGG TGG TGG TGG TGG TGG TGG     88
28   CGT CGT CGT CGT CGT CGT CGT     57           60   TGT TGT TGT TGT TGT TGT TGT     89
29   CTA CTA CTA CTA CTA CTA CTA     58           61   TTA TTA TTA TTA TTA TTA TTA     90
30   CTC CTC CTC CTC CTC CTC CTC     59           62   TTC TTC TTC TTC TTC TTC TTC     91
31   CTG CTG CTG CTG CTG CTG CTG     60           63   TTG TTG TTG TTG TTG TTG TTG     92
32   CTT CTT CTT CTT CTT CTT CTT     61           64   CT CT CT CT CT CT CT CT CT CT   93

Oligonucleotide probes were end-labelled with
³²P-gamma-dATP under standard conditions [Sambrook, J., et
al., Molecular Cloning: A Laboratory Manual, Second
Edition. Cold Spring Harbor Laboratory Press, Cold
Spring Harbor (1989)]. Double stranded DNA was
radiolabelled with ³²P-alpha-dCTP using a Pharmacia random
priming labelling kit according to the manufacturer's
instructions. To improve the labelling reaction, cosmid
DNA was digested with EcoRI prior to radiolabelling. The
test strips were hybridized without prehybridization in
plastic bags containing 6 x SSCP (saline, sodium
citrate-phosphate buffer) and 0.01 % sodium dodecyl
sulfate (SDS) for 16 hrs. Only target-specific binding
to the polypropylene membranes was observed, eliminating
the need for a prehybridization step. The specific activity
of the radiolabelled probes was adjusted to 5 x 10^ cpm/ml
hybridization solution. After hybridization the test
strips were washed in 2 x SSCP, 0.01 % SDS for 20 min.
A variety of hybridization and wash temperatures were
employed, as hereinafter described. Autoradiograms were
developed after 5 minutes to 6 hours exposure at -70°C.
The resulting signals on the test strips were evaluated
visually.

As expected, the synthetic (CAG)7 [SEQ ID NO:28]
oligonucleotide probe hybridized specifically at 60°C to
three rows of the array. These corresponded to the
oligonucleotide repeats (CGT)7 [SEQ ID NO:57], (GTC)7 [SEQ
ID NO:69], and (TCG)7 [SEQ ID NO:84], respectively.

Using double-stranded DNA of 200 bp and 800 bp
containing a (CTG)11 [SEQ ID NO:29] or a (CAG)10 [SEQ ID
NO:30] repeat resulted in a pattern of 6 bands
corresponding to (ACG)7 [SEQ ID NO:37], (CGA)7 [SEQ ID
NO:54], (CGT)7 [SEQ ID NO:57], (GAC)7 [SEQ ID NO:63],
(GTC)7 [SEQ ID NO:75], and (TCG)7 [SEQ ID NO:84], i.e.,
the sense and antisense orientations. Differences in the
signal intensity were observed between the various
triplet-representing lanes.

Using the 3 kb PCR fragment containing a combined
repeat (GCA)8+(GCG)4 [SEQ ID NO:31] to probe the test
strips resulted in a complete set of six oligonucleotides:
(ACG)7 [SEQ ID NO:37], (CGA)7 [SEQ ID NO:54], (CGT)7 [SEQ
ID NO:57], (GAC)7 [SEQ ID NO:63], (GTC)7 [SEQ ID NO:75],
and (TCG)7 [SEQ ID NO:84]. This set represents the six
different frame shifts for the (GCA)8 [SEQ ID NO:94]
repeat. Additionally, the signals found with (CCG)7 [SEQ
ID NO:52], (CGC)7 [SEQ ID NO:55], and (GCC)7 [SEQ ID
NO:67] were evident for the 3'-->5' directed frame of the
(GCG)4 [SEQ ID NO:95] repeat. No signal was detected
under these conditions for the reversed direction
indicated by (GGC)7 [SEQ ID NO:71], (GCG)7 [SEQ ID NO:68],
and (CGC)7 [SEQ ID NO:55].

Using cosmid MDY2 with an insert size of 31 kb as a
probe, a band pattern was observed indicative of the
presence of (CAG)n, (GCC)n, and (CCT)n repeats,
respectively. For the (CCT)n, only one direction of the
oligonucleotide frame as represented by (CCT)7 [SEQ ID
NO:53], (CTC)7 [SEQ ID NO:59], and (TCC)7 [SEQ ID NO:83]
was found to hybridize. A search of 11,613 nt of available
sequence information (GenBank accession L00727) of cosmid
MDY2 revealed the presence of all types of triplet
repeats identified by the test strip (Table 3). The
repeated triplet numbers vary from 3 for the CCT and GCG
type repeats to 11 for the CTG repeat.

TABLE 3

Position (Nucleotide #)     STR

809-817                     (CCT)3
8,172-8,180                 (CCT)3
9,093-9,101                 (C3GA)3
10,364-10,372               (GCG)3
10,677-10,709               (CTG)11 [SEQ ID NO:29]



The influence of temperature on the STR detection
was evaluated by hybridizing cosmid 22.3 at 40°C, 50°C,
and 60°C to the test strips. The strips were then washed
at the temperatures used for the respective
hybridizations. At 40°C, a band pattern was obtained
indicative of (CA)n, (ACC)n, (CCT)n and (GCC)n type
repeats, respectively (Fig. 2A); for the triplet repeats,
only one set of signals representing one direction of the
oligonucleotide frame was observed. The pattern at 50°C
was also specific for (CA)n, (ACC)n, and (GCC)n type
repeats (Fig. 2B); however, unlike the pattern at 40°C
the signals representing a (TCC)n type repeat disappeared
and additional bands indicative of a (CGT)n type repeat
occurred; again, for the trinucleotide repeats only one
set of signals representing one direction of the
respective oligonucleotide frame was found. At 60°C only
signals representing (CA)n and (CCG)n type repeats
persisted (Fig. 2C). Under these hybridization
conditions a full set of the expected 6 bands evident for
a (CCG)n type repeat was observed.

All types of STRs indicated by the test strips were
found to be present in the 34,977 bp sequence available
from cosmid 22.3 (Fig. 3). They range from a single repeat
unit of (CCT)3 and (AGOa up to 18 repeat units for a
(TG)-dinucleotide repeat. There was no unspecific
hybridization signal observed. AT-rich repeats also
occurring in the sequence in three or fewer repeat units
were not detected by the test strips under the
hybridization conditions used.

The array was designed to represent trinucleotide
repeats by all three possible frames in 3'-->5'
direction, as well as in the reverse direction. Thus,
using single stranded DNA a complementary sequence to a
given trinucleotide repeat should result in three signals
on the test strips; using double stranded DNA six
respective bands for a given repeat should occur. For
the four dinucleotide repeats only one frame was used for
each type. Using this reverse blotting system, the
obtained band pattern provided qualitatively the precise
identification of previously known STRs in DNA samples of
various complexities between 21 bp and 34,977 bp.
Moreover, there was no random or cross hybridization to
unspecific sequences observed. Based on the Tm and size
of the STRs as well as possible influences by flanking
sequences, varying the hybridization stringency can
enhance the specificity.
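
The three-signal and six-signal expectation stated above can be sketched as a small calculation. This is an illustrative reading of the design and not code from the source; the function names are hypothetical. It assumes the target repeat unit is written 5'-->3' and the array rows are written 3'-->5' as in Table 2, so that a matching row is the base-by-base complement of the target read in any of its three frames.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def rotations(unit: str) -> set:
    """All cyclic shifts (reading frames) of a repeat unit."""
    return {unit[i:] + unit[:i] for i in range(len(unit))}


def expected_rows(unit: str, double_stranded: bool = False) -> set:
    """Array repeat units (written 3'-->5') expected to give a signal.

    unit is the target repeat written 5'-->3'. For double stranded
    targets the opposite strand is added as a second target.
    """
    rows = rotations(unit.translate(COMPLEMENT))
    if double_stranded:
        # Opposite strand written 5'-->3' is the reverse complement.
        other_strand = unit.translate(COMPLEMENT)[::-1]
        rows |= rotations(other_strand.translate(COMPLEMENT))
    return rows


if __name__ == "__main__":
    print(sorted(expected_rows("CAG")))
    print(sorted(expected_rows("CTG", double_stranded=True)))
```

For the single stranded (CAG)7 probe this gives CGT, GTC and TCG, and for a double stranded CTG/CAG repeat it gives ACG, CGA, CGT, GAC, GTC and TCG, the same sets reported for the test strips in the Example.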

From the foregoing description, one skilled in the
art can readily ascertain the essential characteristics
of the invention and, without departing from the spirit
and scope thereof, can adapt the invention to various
usages and conditions. Changes in form and substitution
of equivalents are contemplated as circumstances may
suggest or render expedient, and although specific terms
have been employed herein, they are intended in a
descriptive sense and not for purposes of limitation.

SEQUENCE LISTING

(1) GENERAL INFORMATION:

  (i) APPLICANT: Beckman Instruments, Inc.

  (ii) TITLE OF INVENTION: OLIGONUCLEOTIDE REPEAT ARRAYS

  (iii) NUMBER OF SEQUENCES: 95

  (iv) CORRESPONDENCE ADDRESS:
    (A) ADDRESSEE: Robbins, Berliner & Carson
    (B) STREET: 201 North Figueroa Street, Suite 500
    (C) CITY: Los Angeles
    (D) STATE: CA
    (E) COUNTRY: USA
    (F) ZIP: 90012

  (v) COMPUTER READABLE FORM:
    (A) MEDIUM TYPE: Floppy disk
    (B) COMPUTER: IBM PC compatible
    (C) OPERATING SYSTEM: PC-DOS/MS-DOS
    (D) SOFTWARE: PatentIn Release #1.0, Version #1.25

  (vi) CURRENT APPLICATION DATA:
    (A) APPLICATION NUMBER:
    (B) FILING DATE:
    (C) CLASSIFICATION:

  (viii) ATTORNEY/AGENT INFORMATION:
    (A) NAME: Spitals, John P.
    (B) REGISTRATION NUMBER: 29,215
    (C) REFERENCE/DOCKET NUMBER: 5727-110

  (ix) TELECOMMUNICATION INFORMATION:
    (A) TELEPHONE: (213) 977-11001
    (B) TELEFAX: (213) 977-1003

(2) INFORMATION FOR SEQ ID NO:1:
  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 12 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
  (ii) MOLECULE TYPE: Other nucleic acid
  (xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:
    CCCACGACGA CG  12

(2) INFORMATION FOR SEQ ID NO:2:
  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 12 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
  (ii) MOLECULE TYPE: Other nucleic acid
  (xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:
    CCCGACGACG AC  12

(2) INFORMATION FOR SEQ ID NO:3:
  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 12 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
  (ii) MOLECULE TYPE: Other nucleic acid
  (xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:
    ACGACGATCA TC  12

(2) INFORMATION FOR SEQ ID NO:4:
  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 12 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
  (ii) MOLECULE TYPE: Other nucleic acid
  (xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:
    ACATACATAC AT  12

(2) INFORMATION FOR SEQ ID NO:5:
  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 15 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
  (ii) MOLECULE TYPE: Other nucleic acid
  (xi) SEQUENCE DESCRIPTION: SEQ ID NO:5:
    AAAAAAAAAA AAAAA  15

(2) INFORMATION FOR SEQ ID NO:6:
  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 15 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
  (ii) MOLECULE TYPE: Other nucleic acid
  (xi) SEQUENCE DESCRIPTION: SEQ ID NO:6:
    AAAAACAACA ACAAC  15

(2) INFORMATION FOR SEQ ID NO:7:
  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 15 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
  (ii) MOLECULE TYPE: Other nucleic acid
  (xi) SEQUENCE DESCRIPTION: SEQ ID NO:7:
    ACAAAAAAAA AAAAA  15

(2) INFORMATION FOR SEQ ID NO:8:
  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 15 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
  (ii) MOLECULE TYPE: Other nucleic acid
  (xi) SEQUENCE DESCRIPTION: SEQ ID NO:8:
    TTTAAAAAAA AAAAA  15

(2) INFORMATION FOR SEQ ID NO:9:
  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 15 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
  (ii) MOLECULE TYPE: Other nucleic acid
  (xi) SEQUENCE DESCRIPTION: SEQ ID NO:9:
    TTTTTTTTTT TTTTT  15

(2) INFORMATION FOR SEQ ID NO:10:
  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 21 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
  (ii) MOLECULE TYPE: Other nucleic acid
  (xi) SEQUENCE DESCRIPTION: SEQ ID NO:10:
    AAAAAAAAAA AAAAAAAAAA A  21

(2) INFORMATION FOR SEQ ID NO:11:
  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 21 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
  (ii) MOLECULE TYPE: Other nucleic acid
  (xi) SEQUENCE DESCRIPTION: SEQ ID NO:11:
    CCCCCCCCCC CCCCCCCCCC C  21

(2) INFORMATION FOR SEQ ID NO:12:
  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 21 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
  (ii) MOLECULE TYPE: Other nucleic acid
  (xi) SEQUENCE DESCRIPTION: SEQ ID NO:12:
    GGGGGGGGGG GGGGGGGGGG G  21

(2) INFORMATION FOR SEQ ID NO:13:
  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 21 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
  (ii) MOLECULE TYPE: Other nucleic acid
  (xi) SEQUENCE DESCRIPTION: SEQ ID NO:13:
    TTTTTTTTTT TTTTTTTTTT T  21

(2) INFORMATION FOR SEQ ID NO:14:
  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 21 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
  (ii) MOLECULE TYPE: Other nucleic acid
  (xi) SEQUENCE DESCRIPTION: SEQ ID NO:14:
    AACAACAACA ACAACAACAA C  21

(2) INFORMATION FOR SEQ ID NO:15:
  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 18 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
  (ii) MOLECULE TYPE: Other nucleic acid
  (xi) SEQUENCE DESCRIPTION: SEQ ID NO:15:
    ACAACAACAA CAACAACA  18

(2) INFORMATION FOR SEQ ID NO:16:
  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 18 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
  (ii) MOLECULE TYPE: Other nucleic acid
  (xi) SEQUENCE DESCRIPTION: SEQ ID NO:16:
    CAACAACAAC AACAACAA  18

(2) INFORMATION FOR SEQ ID NO:17:
  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 18 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
  (ii) MOLECULE TYPE: Other nucleic acid
  (xi) SEQUENCE DESCRIPTION: SEQ ID NO:17:
    TGCTGCTGCT GCTGCTGC  18

(2) INFORMATION FOR SEQ ID NO:18:
  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 36 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
  (ii) MOLECULE TYPE: Other nucleic acid
  (xi) SEQUENCE DESCRIPTION: SEQ ID NO:18:
    TGCTGCTGCT GCTGTGTGTG TGTGTTATTA TTATTA  36

(2) INFORMATION FOR SEQ ID NO:19:
  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 24 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
  (ii) MOLECULE TYPE: Other nucleic acid
  (xi) SEQUENCE DESCRIPTION: SEQ ID NO:19:
    TGGTGGTGGT GGTGATGATG ATGA  24

(2) INFORMATION FOR SEQ ID NO:20:
  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 18 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
  (ii) MOLECULE TYPE: Other nucleic acid
  (xi) SEQUENCE DESCRIPTION: SEQ ID NO:20:
    TCGTCGTCGT CGTCGTCG  18

(2) INFORMATION FOR SEQ ID NO:21:
  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 15 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
  (ii) MOLECULE TYPE: Other nucleic acid
  (xi) SEQUENCE DESCRIPTION: SEQ ID NO:21:
    GGCGGCGGCG GCGGC  15

(2) INFORMATION FOR SEQ ID NO:22:
  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 20 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
  (ii) MOLECULE TYPE: Other nucleic acid
  (xi) SEQUENCE DESCRIPTION: SEQ ID NO:22:
    CGCGCGCGGA CGACGACGAC  20

(2) INFORMATION FOR SEQ ID NO:23:
  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 36 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
  (ii) MOLECULE TYPE: Other nucleic acid
  (xi) SEQUENCE DESCRIPTION: SEQ ID NO:23:
    TCATCATCAT CATAATAATA ATAATACTAC TACTAC  36

(2) INFORMATION FOR SEQ ID NO:24:
  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 24 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
  (ii) MOLECULE TYPE: Other nucleic acid
  (xi) SEQUENCE DESCRIPTION: SEQ ID NO:24:
    CCACCACCAC CACAACAACA ACAA  24

(2) INFORMATION FOR SEQ ID NO:25:
  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 36 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
  (ii) MOLECULE TYPE: Other nucleic acid
  (xi) SEQUENCE DESCRIPTION: SEQ ID NO:25:
    TTCTTCTTCT TCTCTCTCTC TCTCTCCTCC TCCTCC  36

(2) INFORMATION FOR SEQ ID NO:26:
  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 36 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
  (ii) MOLECULE TYPE: Other nucleic acid
  (xi) SEQUENCE DESCRIPTION: SEQ ID NO:26:
    GAAGAAGAAG AAGAGAGAGA GAGAGGAGGA GGAGGA  36

(2) INFORMATION FOR SEQ ID NO:27:
  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 24 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
  (ii) MOLECULE TYPE: Other nucleic acid
  (xi) SEQUENCE DESCRIPTION: SEQ ID NO:27:
    GCCGCCGCCG CCGCAGCAGC AGCA  24

(2) INFORMATION FOR SEQ ID NO:28:
  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 21 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
  (ii) MOLECULE TYPE: Other nucleic acid
  (xi) SEQUENCE DESCRIPTION: SEQ ID NO:28:
    CAGCAGCAGC AGCAGCAGCA G  21

(2) INFORMATION FOR SEQ ID NO:29:
  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 33 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
  (ii) MOLECULE TYPE: Other nucleic acid
  (xi) SEQUENCE DESCRIPTION: SEQ ID NO:29:
    CTGCTGCTGC TGCTGCTGCT GCTGCTGCTG CTG  33

(2) INFORMATION FOR SEQ ID NO:30:
  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 30 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
  (ii) MOLECULE TYPE: Other nucleic acid
  (xi) SEQUENCE DESCRIPTION: SEQ ID NO:30:
    CAGCAGCAGC AGCAGCAGCA GCAGCAGCAG  30

(2) INFORMATION FOR SEQ ID NO:31:
  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 36 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
  (ii) MOLECULE TYPE: Other nucleic acid
  (xi) SEQUENCE DESCRIPTION: SEQ ID NO:31:
    GCAGCAGCAG CAGCAGCAGC AGCAGCGGCG GCGGCG  36

(2) INFORMATION FOR SEQ ID Nd:32: 



(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 20 base pairs 
45 (B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



50 



(ii) MOLECULE TYPE: Other nucleic acid 



(xi) SEQUENCE DESCRIPTION: SEQ ID N0:32: 

* 5 5 ACACACACAC ACACACACAC 

(2) INFORMATION FOR SEQ ID N0:33: 

(i) SEQUENCE CHARACTERISTICS: 
60 (A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

65 (ii) MOLECULE TYPE: Other nucleic acid 



70 



(xi) SEQUENCE DESCRIPTION: SEQ ID N0:33: 
AAGAAGAAGA AGAAGAAGAA G 
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(2) INFORMATION FOR SEQ ID N0:34: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(if) MOLECULE TYPE: Other nucleic acid 



(XI) SEQUENCE DESCRIPTION: SEQ ID NO:34: 

15 AATAATAATA ATAATAATAA T 21 

(2) INFORMATION FOR SEQ ID NO:35: 

(!) SEQUENCE CHARACTERISTICS: 
20 (A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

25 (ii) MOLECULE TYPE: Other nuclei^; acid 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:35: 
ACAACAACAA CAACAACAAC A 21 
(2) INFORMATION FOR SEQ ID N0:36: 



35 (i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(11) MOLECULE TYPE: Other nucleic acid 



45 (xi) SEQUENCE DESCRIPTION: SEQ ID N0:36: 

ACCACCACCA CCACCACCAC C 
(2) INFORMATION FOR SEQ ID N0:37: 

50 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
' 55 (D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Other nucleic acid 



60 

(xi) SEQUENCE DESCRIPTION: SEQ ID N0:37: 

ACGACGACGA CGACGACGAC G 21 

65 (2) INFORMATION FOR SEQ ID N0:38: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 21 base pairs 
(8) TYPE: nucleic acid 
70 (C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 
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(ii) MOLECULE TYPE: Other nucleic acid 

(xi) SEQUENCE DESCRIPTION: SEQ ID N0:38: 
ACTACTACTA CTACTACTAC T 21 
(2) INFORMATION FOR SEO ID N0:39: 



(1) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
15 (D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: Other nucleic acid 



(xi) SEQUENCE DESCRIPTION: SEQ ID N0:39: 

AGAAGAAGAA GAAGAAGAAG A 21 . 

(2) INFORMATION FOR SEQ ID NO:40:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Other nucleic acid

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:40:

AGCAGCAGCA GCAGCAGCAG C 21
(2) INFORMATION FOR SEQ ID NO:41:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Other nucleic acid

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:41:

AGGAGGAGGA GGAGGAGGAG G 21

(2) INFORMATION FOR SEQ ID NO:42:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Other nucleic acid

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:42:

AGTAGTAGTA GTAGTAGTAG T 21

(2) INFORMATION FOR SEQ ID NO:43:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Other nucleic acid

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:43:

ATAATAATAA TAATAATAAT A 21

(2) INFORMATION FOR SEQ ID NO:44:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Other nucleic acid

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:44:

ATCATCATCA TCATCATCAT C 21
(2) INFORMATION FOR SEQ ID NO:45:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Other nucleic acid

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:45:

ATGATGATGA TGATGATGAT G 21
(2) INFORMATION FOR SEQ ID NO:46:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Other nucleic acid

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:46:

ATTATTATTA TTATTATTAT T 21
(2) INFORMATION FOR SEQ ID NO:47:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Other nucleic acid

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:47:

CAACAACAAC AACAACAACA A 21
(2) INFORMATION FOR SEQ ID NO:48:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Other nucleic acid

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:48:

CACCACCACC ACCACCACCA C 21
(2) INFORMATION FOR SEQ ID NO:49:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Other nucleic acid

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:49:

CATCATCATC ATCATCATCA T 21
(2) INFORMATION FOR SEQ ID NO:50:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Other nucleic acid

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:50:

CCACCACCAC CACCACCACC A 21

(2) INFORMATION FOR SEQ ID NO:51:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Other nucleic acid

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:51:

CGCGCGCGCG CGCGCGCGCG 20

(2) INFORMATION FOR SEQ ID NO:52:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Other nucleic acid

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:52:

CCGCCGCCGC CGCCGCCGCC G 21

(2) INFORMATION FOR SEQ ID NO:53:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Other nucleic acid

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:53:

CCTCCTCCTC CTCCTCCTCC T 21
(2) INFORMATION FOR SEQ ID NO:54:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Other nucleic acid

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:54:

CGACGACGAC GACGACGACG A 21
(2) INFORMATION FOR SEQ ID NO:55:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Other nucleic acid

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:55:

CGCCGCCGCC GCCGCCGCCG C 21
(2) INFORMATION FOR SEQ ID NO:56:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Other nucleic acid

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:56:

CGGCGGCGGC GGCGGCGGCG G 21
(2) INFORMATION FOR SEQ ID NO:57:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Other nucleic acid

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:57:

CGTCGTCGTC GTCGTCGTCG T 21
(2) INFORMATION FOR SEQ ID NO:58:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Other nucleic acid

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:58:

CTACTACTAC TACTACTACT A 21
(2) INFORMATION FOR SEQ ID NO:59:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Other nucleic acid

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:59:

CTCCTCCTCC TCCTCCTCCT C 21

(2) INFORMATION FOR SEQ ID NO:60:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Other nucleic acid

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:60:

CTGCTGCTGC TGCTGCTGCT G 21

(2) INFORMATION FOR SEQ ID NO:61:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Other nucleic acid

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:61:

CTTCTTCTTC TTCTTCTTCT T 21

(2) INFORMATION FOR SEQ ID NO:62:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Other nucleic acid

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:62:

GAAGAAGAAG AAGAAGAAGA A 21
(2) INFORMATION FOR SEQ ID NO:63:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Other nucleic acid

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:63:

GACGACGACG ACGACGACGA C 21
(2) INFORMATION FOR SEQ ID NO:64:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Other nucleic acid

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:64:

GAGGAGGAGG AGGAGGAGGA G 21
(2) INFORMATION FOR SEQ ID NO:65:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Other nucleic acid

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:65:

GATGATGATG ATGATGATGA T 21
(2) INFORMATION FOR SEQ ID NO:66:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Other nucleic acid

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:66:

GCAGCAGCAG CAGCAGCAGC A 21
(2) INFORMATION FOR SEQ ID NO:67:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Other nucleic acid

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:67:

GCCGCCGCCG CCGCCGCCGC C 21

(2) INFORMATION FOR SEQ ID NO:68:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Other nucleic acid

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:68:

GCGGCGGCGG CGGCGGCGGC G 21

(2) INFORMATION FOR SEQ ID NO:69:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Other nucleic acid

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:69:

GCTGCTGCTG CTGCTGCTGC T 21



(2) INFORMATION FOR SEQ ID NO:70:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Other nucleic acid

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:70:

GGAGGAGGAG GAGGAGGAGG A 21

(2) INFORMATION FOR SEQ ID NO:71:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Other nucleic acid

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:71:

GGCGGCGGCG GCGGCGGCGG C 21
(2) INFORMATION FOR SEQ ID NO:72:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Other nucleic acid

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:72:

AGAGAGAGAG AGAGAGAGAG 20
(2) INFORMATION FOR SEQ ID NO:73:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Other nucleic acid

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:73:

GGTGGTGGTG GTGGTGGTGG T 21
(2) INFORMATION FOR SEQ ID NO:74:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Other nucleic acid

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:74:

GTAGTAGTAG TAGTAGTAGT A 21
(2) INFORMATION FOR SEQ ID NO:75:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Other nucleic acid

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:75:

GTCGTCGTCG TCGTCGTCGT C 21
(2) INFORMATION FOR SEQ ID NO:76:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Other nucleic acid

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:76:

GTGGTGGTGG TGGTGGTGGT G 21

(2) INFORMATION FOR SEQ ID NO:77:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Other nucleic acid

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:77:

GTTGTTGTTG TTGTTGTTGT T 21

(2) INFORMATION FOR SEQ ID NO:78:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Other nucleic acid

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:78:

TAATAATAAT AATAATAATA A 21

(2) INFORMATION FOR SEQ ID NO:79:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Other nucleic acid

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:79:

TACTACTACT ACTACTACTA C 21

(2) INFORMATION FOR SEQ ID NO:80:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Other nucleic acid

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:80:

TAGTAGTAGT AGTAGTAGTA G 21

(2) INFORMATION FOR SEQ ID NO:81:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Other nucleic acid

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:81:

TATTATTATT ATTATTATTA T 21
(2) INFORMATION FOR SEQ ID NO:82:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Other nucleic acid

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:82:

TCATCATCAT CATCATCATC A 21
(2) INFORMATION FOR SEQ ID NO:83:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Other nucleic acid

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:83:

TCCTCCTCCT CCTCCTCCTC C 21



(2) INFORMATION FOR SEQ ID NO:84:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Other nucleic acid

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:84:

TCGTCGTCGT CGTCGTCGTC G 21

(2) INFORMATION FOR SEQ ID NO:85:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Other nucleic acid

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:85:

TCTTCTTCTT CTTCTTCTTC T 21

(2) INFORMATION FOR SEQ ID NO:86:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Other nucleic acid

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:86:

TGATGATGAT GATGATGATG A 21
(2) INFORMATION FOR SEQ ID NO:87:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Other nucleic acid

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:87:

TGCTGCTGCT GCTGCTGCTG C 21



(2) INFORMATION FOR SEQ ID NO:88:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Other nucleic acid

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:88:

TGGTGGTGGT GGTGGTGGTG G 21

(2) INFORMATION FOR SEQ ID NO:89:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Other nucleic acid

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:89:

TGTTGTTGTT GTTGTTGTTG T 21
(2) INFORMATION FOR SEQ ID NO:90:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Other nucleic acid

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:90:

TTATTATTAT TATTATTATT A 21
(2) INFORMATION FOR SEQ ID NO:91:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Other nucleic acid

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:91:

TTCTTCTTCT TCTTCTTCTT C 21
(2) INFORMATION FOR SEQ ID NO:92:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Other nucleic acid

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:92:

TTGTTGTTGT TGTTGTTGTT G 21
(2) INFORMATION FOR SEQ ID NO:93:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Other nucleic acid

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:93:

CTCTCTCTCT CTCTCTCTCT 20
(2) INFORMATION FOR SEQ ID NO:94:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 24 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Other nucleic acid

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:94:

GCAGCAGCAG CAGCAGCAGC AGCA 24

(2) INFORMATION FOR SEQ ID NO:95:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 12 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: Other nucleic acid

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:95:

GCGGCGGCGG CG 12

WHAT IS CLAIMED IS:

1. Apparatus for identifying tandem nucleotide 
repeats in a sample containing same, comprising: 

a solid support; and

a plurality of oligonucleotides fixedly
attached thereto to form an array, the
oligonucleotides defining a set of tandem nucleotide
repeats including sequences complementary to tandem
nucleotide repeats in the sample, the array
establishing a pattern such that identity of a
tandem nucleotide repeat in the sample may be
ascertained by location in the pattern of the
complementary tandem nucleotide repeat upon
hybridization of the sample to the array.
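
Illustrative note (not part of the claims): the addressable array recited above can be sketched as a mapping from a position in the pattern to the oligonucleotide attached there. A hybridization signal at a position then identifies the repeat in the sample as the reverse complement of the probe at that address. The function names, the row-major layout and the four-column grid below are assumptions made purely for this example.

```python
# Hypothetical model of an addressable repeat array (illustration only).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def build_array(probes, columns):
    """Lay probes out row by row; keys are (row, column) addresses."""
    return {divmod(i, columns): probe for i, probe in enumerate(probes)}


def identify(array, hit_positions):
    """Infer sample repeats from the addresses that gave a signal."""
    return [reverse_complement(array[pos]) for pos in hit_positions]


# Six 21-nucleotide triplet-repeat probes, as in SEQ ID NOS:34-39.
probes = [unit * 7 for unit in ("AAT", "ACA", "ACC", "ACG", "ACT", "AGA")]
array = build_array(probes, columns=4)
hits = identify(array, [(0, 2), (1, 1)])
print(hits)
```

Here a signal at address (0, 2), where the (ACC)7 probe sits, points to a (GGT)7 repeat in the sample.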

2. Apparatus according to claim 1, wherein the 
array comprises a linear sequence of oligonucleotides. 

3. Apparatus according to claim 1, wherein the
array comprises a two-dimensional pattern.

4. Apparatus according to claim 1, wherein the
oligonucleotides have the formula (Nmer)n, in which N is
an integer greater than 1 and represents the number of
nucleotides in a repeat pattern and n is an integer from
two to about 50.
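
Illustrative note (not part of the claims): the formula (Nmer)n can be read as a repeat unit of N nucleotides written out n times. The sketch below builds such a sequence and prints it in blocks of ten, the layout used in the sequence listing. The function names are hypothetical, and treating 50 as a hard upper limit for n is an assumption, since the claim says "about 50".

```python
def tandem_repeat(unit: str, n: int) -> str:
    """Build (Nmer)n, where N = len(unit) and n is the repeat count."""
    if len(unit) < 2:
        raise ValueError("N must be an integer greater than 1")
    if not 2 <= n <= 50:  # assumption: 'about 50' treated as exactly 50
        raise ValueError("n must be from two to about 50")
    return unit * n


def grouped(seq: str, block: int = 10) -> str:
    """Format a sequence in space-separated blocks, as in the listing."""
    return " ".join(seq[i:i + block] for i in range(0, len(seq), block))


# (3mer)7 with unit AAT gives the 21-nucleotide sequence of SEQ ID NO:34.
seq_34 = tandem_repeat("AAT", 7)
print(grouped(seq_34), len(seq_34))  # AATAATAATA ATAATAATAA T 21
```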

5. Apparatus according to claim 4, wherein the set
of tandem nucleotide repeats comprises a complete set of
(3mer)n repeats excluding homopolymers.
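
Illustrative note (not part of the claims): a complete set of repeat units of a given length, excluding homopolymers, can be enumerated directly. For triplets this gives 4^3 - 4 = 60 units. The sketch below counts cyclic permutations such as AAT, ATA and TAA as separate units, which matches their separate entries in the sequence listing (SEQ ID NOS:34, 43 and 78). The function name is hypothetical.

```python
from itertools import product


def complete_repeat_units(unit_length: int, alphabet: str = "ACGT"):
    """All repeat units of the given length, excluding homopolymers."""
    units = ("".join(p) for p in product(alphabet, repeat=unit_length))
    # A homopolymer unit (AAA, CC, ...) uses only one distinct base.
    return [u for u in units if len(set(u)) > 1]


triplets = complete_repeat_units(3)
print(len(triplets))  # 60
```

Calling the same function with a unit length of 2 yields the 12 non-homopolymer doublets.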

6. Apparatus according to claim 1, wherein the set
of tandem nucleotide repeats comprises a complete set of
(2mer)n repeats excluding homopolymers.

7. Apparatus according to claim 1, wherein the set
of tandem nucleotide repeats comprises a set of
oligonucleotides of the formula

[(Nmer)n(Mmer)m]x

or

[(Nmer)n(Mmer)m(Pmer)p]x

in which each of N, M and P is independently
selected from the set of integers greater than 1 and
represent the number of nucleotides in a repeat
pattern;

each of n, m and p is independently selected
from the set of integers; and

x is an integer, with the proviso that x(Nn +
Mm) or x(Nn + Mm + Pp) is between 4 and about 100.
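
Illustrative note (not part of the claims): the proviso bounds the total length of a compound repeat, which is x(Nn + Mm) for two blocks or x(Nn + Mm + Pp) for three. The sketch below assembles such a sequence and checks that length. The function name is hypothetical, and treating the range 4 to 100 as inclusive and the repeat counts as positive are assumptions, since the claim says "about 100" and "the set of integers".

```python
def compound_repeat(blocks, x, min_len=4, max_len=100):
    """Build [(Nmer)n(Mmer)m...]x from (unit, count) pairs.

    blocks: e.g. [("GCA", 2), ("GT", 3)] for (GCA)2(GT)3.
    The 4..100 bounds are an inclusive reading of the proviso.
    """
    for unit, count in blocks:
        if len(unit) < 2:
            raise ValueError("each of N, M, P must be greater than 1")
        if count < 1:  # assumption: repeat counts are positive
            raise ValueError("each of n, m, p must be a positive integer")
    core = "".join(unit * count for unit, count in blocks)
    total = x * len(core)  # equals x(Nn + Mm) or x(Nn + Mm + Pp)
    if not min_len <= total <= max_len:
        raise ValueError(f"total length {total} is outside the proviso")
    return core * x


# [(GCA)2(GT)3]2: N=3, n=2, M=2, m=3, x=2, so x(Nn + Mm) = 2 * 12 = 24.
example = compound_repeat([("GCA", 2), ("GT", 3)], x=2)
print(example, len(example))
```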

8. Apparatus according to claim 1, wherein the 
solid support comprises a material selected from glass 
and plastic films. 

9. Apparatus according to claim 8, wherein the
solid support comprises polypropylene.

10. A method for identifying tandem nucleotide
repeats in a sample containing same, comprising:

bringing the sample into contact under
hybridization conditions with an array comprising a
plurality of oligonucleotides fixedly attached to a
solid support, the oligonucleotides defining a set
of tandem nucleotide repeats including sequences
complementary to tandem nucleotide repeats in the
sample, the array establishing a predetermined
pattern; and

identifying the tandem nucleotide repeats in
the sample by determining location in the pattern of
the complementary tandem nucleotide repeats to which
the tandem nucleotide repeats are hybridized.

11. A method according to claim 10, wherein the
sample is labelled prior to bringing it into contact with
the array.

12. A method according to claim 11, wherein the
label is selected from the group consisting of
radioactive labels and fluorescent labels.

13. A method according to claim 10, wherein the
oligonucleotides have the formula (Nmer)n, in which N is
an integer greater than 1 and represents the number of
nucleotides in a repeat pattern and n is an integer from
two to about 50.



14. A method according to claim 10, wherein the set
of tandem nucleotide repeats comprises a complete set of
(3mer)n repeats excluding homopolymers.

15. A method according to claim 10, wherein the set
of tandem nucleotide repeats comprises a complete set of
(2mer)n repeats excluding homopolymers.

16. A method according to claim 10, wherein the set
of tandem nucleotide repeats comprises a set of
oligonucleotides of the formula

[(Nmer)n(Mmer)m]x

or

[(Nmer)n(Mmer)m(Pmer)p]x

in which each of N, M and P is independently
selected from the set of integers greater than 1 and
represent the number of nucleotides in a repeat
pattern;

each of n, m and p is independently selected
from the set of integers; and

x is an integer, with the proviso that x(Nn +
Mm) or x(Nn + Mm + Pp) is between 4 and about 100.

17. A method for determining size of tandem
nucleotide repeats in a sample containing same,
comprising:

bringing the sample into contact under
hybridization conditions with an array comprising a
plurality of oligonucleotides fixedly attached to a
solid support, the oligonucleotides defining a set
of tandem nucleotide repeats including sequences
complementary to tandem nucleotide repeats in the
sample, the array establishing a predetermined
pattern;

identifying the tandem nucleotide repeats in
the sample by determining location in the pattern of
the complementary tandem nucleotide repeats to which
the tandem nucleotide repeats are hybridized to form
hybrids;

selectively dissociating the hybrids to elute
the tandem nucleotide repeat from the support; and

sizing the eluted tandem nucleotide repeats.



18. A method according to claim 17, wherein the 
eluted tandem nucleotide repeats are sized by gel 
electrophoresis against DNA size markers.